NSF PAR Search | NSF Public Access Repository

Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication

https://doi.org/10.1109/SC41406.2024.00052

Ranawaka, Isuru; Hussain, Md Taufique; Block, Charles; Gerogiannis, Gerasimos; Torrellas, Josep; Azad, Ariful (November 2024, IEEE)

We consider a sparse matrix-matrix multiplication (SpGEMM) setting where one matrix is square and the other is tall and skinny. This special variant, TS-SpGEMM, has important applications in multi-source breadth-first search, influence maximization, sparse graph embedding, and algebraic multigrid solvers. Unfortunately, popular distributed algorithms like sparse SUMMA deliver suboptimal performance for TS-SpGEMM. To address this limitation, we develop a novel distributed-memory algorithm tailored for TS-SpGEMM. Our approach employs customized 1D partitioning for all matrices involved and leverages sparsity-aware tiling for efficient data transfers. In addition, it minimizes communication overhead by incorporating both local and remote computations. On average, our TS-SpGEMM algorithm attains 5x performance gains over 2D and 3D SUMMA. Furthermore, we use our algorithm to implement multi-source breadth-first search and sparse graph embedding algorithms and demonstrate their scalability up to 512 nodes (or 65,536 cores) on NERSC Perlmutter.
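As a rough illustration of the 1D-partitioned setting the abstract describes, the following single-process Python sketch simulates row-wise partitioning of a square sparse matrix A and a tall-and-skinny sparse matrix B, and counts how many remote rows of B each simulated process would need. It is not the authors' algorithm: the function names, the dictionary-based matrix layout, and the "fetch only the B rows touched by local nonzeros" rule are assumptions made for this example, and the tiling and local/remote computation choices from the paper are not modeled.

```python
from collections import defaultdict

def partition_rows(n_rows, n_procs):
    """Split rows 0..n_rows-1 into n_procs contiguous blocks (hypothetical 1D layout)."""
    base, extra = divmod(n_rows, n_procs)
    bounds, start = [], 0
    for p in range(n_procs):
        end = start + base + (1 if p < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds

def ts_spgemm_1d(A, B, n_rows, n_procs):
    """Compute C = A @ B with A and B stored as {row: {col: value}}.

    A is n_rows x n_rows and B is n_rows x d with d much smaller than n_rows.
    Returns C in the same layout, plus the number of remote B rows
    each simulated process would have to fetch.
    """
    bounds = partition_rows(n_rows, n_procs)
    C, fetched = {}, []
    for lo, hi in bounds:
        # Sparsity-aware step (assumption): request only the rows of B that are
        # referenced by a nonzero column of the locally owned rows of A.
        needed = set()
        for i in range(lo, hi):
            for k in A.get(i, {}):
                if not (lo <= k < hi) and k in B:
                    needed.add(k)
        fetched.append(len(needed))
        # Local row-by-row (Gustavson-style) multiplication.
        for i in range(lo, hi):
            acc = defaultdict(float)
            for k, a_ik in A.get(i, {}).items():
                for j, b_kj in B.get(k, {}).items():
                    acc[j] += a_ik * b_kj
            if acc:
                C[i] = dict(acc)
    return C, fetched
```

Calling `ts_spgemm_1d(A, B, n_rows=4, n_procs=2)` on small dictionaries returns the product together with a per-process count of remote rows, which gives a feel for how the sparsity pattern of A drives communication volume in a 1D layout.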
Full Text Available
Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication

Ranawaka, Isuru; Hussain, Md Taufique; Block, Charles; Gerogiannis, Gerasimos; Torrellas, Josep; Azad, Ariful (November 2024, International Conference for High Performance Computing, Networking, Storage and Analysis (SC))

Full Text Available
Two-Face: Combining Collective and One-Sided Communication for Efficient Distributed SpMM

https://doi.org/10.1145/3620665.3640427

Block, Charles; Gerogiannis, Gerasimos; Mendis, Charith; Azad, Ariful; Torrellas, Josep (April 2024, ACM)

Sparse matrix dense matrix multiplication (SpMM) is commonly used in applications ranging from scientific computing to graph neural networks. Typically, when SpMM is executed in a distributed platform, communication costs dominate. Such costs depend on how communication is scheduled. If it is scheduled in a sparsity-unaware manner, such as with collectives, execution is often inefficient due to unnecessary data transfers. On the other hand, if communication is scheduled in a fine-grained sparsity-aware manner, communicating only the necessary data, execution can also be inefficient due to high software overhead. We observe that individual sparse matrices often contain regions that are denser and regions that are sparser. Based on this observation, we develop a model that partitions communication into sparsity-unaware and sparsity-aware components. Leveraging the partition, we develop a new algorithm that performs collective communication for the denser regions, and fine-grained, one-sided communication for the sparser regions. We call the algorithm Two-Face. We show that Two-Face attains an average speedup of 2.11x over prior work when evaluated on a 4096-core supercomputer. Additionally, Two-Face scales well with the machine size.
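The idea of splitting a sparse matrix into denser and sparser regions can be sketched in a few lines of Python. The example below buckets nonzeros into fixed-size tiles and labels each nonempty tile by a simple density threshold. This is only a stand-in: the abstract says the authors develop a model for the partition, whereas the fixed threshold, the tile shape, and the name `classify_tiles` are assumptions introduced here for illustration.

```python
from collections import Counter

def classify_tiles(nonzeros, tile_rows, tile_cols, density_threshold):
    """Label nonempty tiles of a sparse matrix as dense or sparse.

    nonzeros: iterable of (row, col) coordinates of nonzero entries.
    Returns (dense_tiles, sparse_tiles), each a sorted list of
    (tile_row, tile_col) indices. In a Two-Face-like scheme, dense tiles
    would be served by collective communication and sparse tiles by
    fine-grained one-sided transfers (hypothetical mapping).
    """
    counts = Counter((r // tile_rows, c // tile_cols) for r, c in nonzeros)
    tile_area = tile_rows * tile_cols
    dense, sparse = [], []
    for tile, nnz in sorted(counts.items()):
        if nnz / tile_area >= density_threshold:
            dense.append(tile)
        else:
            sparse.append(tile)
    return dense, sparse
```

A threshold closer to 0 pushes more tiles toward the collective path, and a threshold closer to 1 pushes more toward the one-sided path; choosing it well is exactly what a cost model would be for.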
HotTiles: Accelerating SpMM with Heterogeneous Accelerator Architectures

https://doi.org/10.1109/HPCA57654.2024.00081

Gerogiannis, Gerasimos; Aananthakrishnan, Sriram; Torrellas, Josep; Hur, Ibrahim (March 2024, IEEE)

Full Text Available
SPADE: A Flexible and Scalable Accelerator for SpMM and SDDMM

https://doi.org/10.1145/3579371.3589054

Gerogiannis, Gerasimos; Yesil, Serif; Lenadora, Damitha; Cao, Dingyuan; Mendis, Charith; Torrellas, Josep (June 2023, International Symposium on Computer Architecture (ISCA))

Full Text Available